Exposing Homograph Obfuscation Intentions by Coloring Unicode Strings

Authors

  • Wenyin Liu
  • Anthony Y. Fu
  • Xiaotie Deng
Abstract

Unicode has become a useful tool for information internationalization, particularly in web links, web pages, and emails. However, many Unicode glyphs look so similar that attackers can exploit the resemblance to deceive readers. In this paper, we propose Unicode string coloring as a countermeasure to this emerging threat. A coloring algorithm is designed and prototyped to assign colors to a set of required languages/scripts such that each language/script is displayed in its own unique color, while the color difference among different languages/scripts is maximized. Based on this algorithm, we propose both fixed and adaptive coloring schemes for rendering Unicode strings in web links and documents, so that mixed Unicode characters from different language/script groups are distinguished and potential homograph obfuscation intentions are vividly illustrated. Our user study shows that the coloring helps alert end users to suspiciously displayed strings.
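
The idea of coloring a string by script can be illustrated with a minimal Python sketch. This is not the authors' algorithm: the abstract does not specify how scripts are detected or how color difference is maximized, so the sketch assumes two stand-ins. The script of a character is guessed from the first word of its Unicode character name, and colors are spread evenly around the HSV hue wheel. The names script_of, assign_colors, and color_string are hypothetical.

```python
import colorsys
import unicodedata


def script_of(ch):
    """Rough script guess (an assumption of this sketch): the first word
    of the Unicode character name, e.g. LATIN, CYRILLIC, GREEK."""
    if not ch.isalpha():
        return "COMMON"  # digits, punctuation, spaces carry no script
    name = unicodedata.name(ch, "")
    return name.split(" ")[0] if name else "UNKNOWN"


def assign_colors(scripts):
    """Give each script its own hue, evenly spaced on the HSV wheel.
    Even spacing is a simple stand-in for maximizing color difference."""
    ordered = sorted(scripts)
    colors = {}
    for i, script in enumerate(ordered):
        r, g, b = colorsys.hsv_to_rgb(i / len(ordered), 1.0, 0.8)
        colors[script] = "#{:02x}{:02x}{:02x}".format(
            round(r * 255), round(g * 255), round(b * 255)
        )
    return colors


def color_string(text):
    """Return ([(char, script, color or None), ...], is_mixed).
    is_mixed is True when letters from more than one script appear."""
    scripts = {script_of(c) for c in text} - {"COMMON"}
    colors = assign_colors(scripts)
    runs = [(c, script_of(c), colors.get(script_of(c))) for c in text]
    return runs, len(scripts) > 1


if __name__ == "__main__":
    # The second letter is CYRILLIC SMALL LETTER A (U+0430), not Latin "a".
    runs, mixed = color_string("p\u0430ypal.com")
    for ch, script, color in runs:
        print(repr(ch), script, color)
    print("mixed scripts:", mixed)
```

A renderer could wrap each character in a span styled with its assigned color, so that a lone Cyrillic letter inside an otherwise Latin domain name stands out visually. A production system would more likely rely on the Unicode Script property than on character names.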

Similar resources

Preparation, Enforcement, and Comparison of Internationalized Strings Representing Nicknames

This document describes methods for handling Unicode strings representing memorable, human-friendly names (called "nicknames", "display names", or "petnames") for people, devices, accounts, websites, and other entities. This document obsoletes RFC 7700.

Non-standard word and homograph resolution for asian language text analysis

In this paper we present a general model for text analysis of Asian languages (Chinese and Japanese). That is a method for mapping strings of characters to strings of identified trivially pronounceable words. This work is based on the English NonStandard Word analysis model suitably augmented to deal with both the lack of spaces between words in Japanese and Chinese and addressing the issues of...

Draft Paul Hoffman draft

This document describes a framework for preparing Unicode text strings in order to increase the likelihood that string input and string comparison work in ways that make sense for typical users throughout the world. The stringprep protocol is useful for protocol identifier values, company and personal names, internationalized domain names, and other text strings. This document does not specify ...

Automatic Detection for JavaScript Obfuscation Attacks in Web Pages through String Pattern Analysis

Recently, most malicious web pages include obfuscated code in order to circumvent signature-based detection systems. It is difficult to decide whether a string is obfuscated because the shape of obfuscated strings changes continuously. In this paper, we propose a novel methodology that can detect obfuscated strings in malicious web pages. We extracted three metrics...

Indistinguishable Predicates: A New Tool for Obfuscation

Opaque predicates are a commonly used technique in program obfuscation, intended to add complexity to control flow and to insert dummy code or watermarks. We survey a number of methods to remove opaque predicates from obfuscated programs, hence defeating the intentions of the obfuscator. Our main contribution is an obfuscation technique that introduces opaque constant predicates that are provab...

Publication date: 2008